-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix "zombie" http output websocket connections #125
base: main
Are you sure you want to change the base?
Fix "zombie" http output websocket connections #125
Conversation
const ( | ||
writeWait = 10 * time.Second | ||
pongWait = 60 * time.Second | ||
pingPeriod = (pongWait * 9) / 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those who would read it may find it strange, so here is why:
By setting pingPeriod
to 90% of pongWait
(i.e., (pongWait * 9) / 10
), we ensure that:
-
The server sends a ping before the
pongWait
deadline:- If
pongWait
is 60 seconds,pingPeriod
becomes 54 seconds. - This means the server sends a ping every 54 seconds.
- If
-
There's a buffer period for the client to respond:
- The client has 6 seconds (the remaining 10% of
pongWait
) to respond with a pong before the read deadline (pongWait
) expires.
- The client has 6 seconds (the remaining 10% of
-
Avoiding premature timeouts:
- If the
pingPeriod
were equal topongWait
, the read deadline might expire before the client has a chance to respond to the ping. - By setting
pingPeriod
slightly less thanpongWait
, we avoid this race condition.
- If the
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all makes sense! A small nit would be we should consider adding these as configuration variables with the advanced()
flag. Think these are sensible defaults in any-case.
pingPeriod = (pongWait * 9) / 10 | ||
) | ||
|
||
ws.SetReadLimit(512) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the question, on why we need both ping/pong and Read goroutine:
- Necessity of Ping/Pong with a Read Loop:
- Idle Clients: The read loop alone cannot detect if an idle client has disconnected unexpectedly.
- Timely Detection: Ping/pong can detect unresponsive clients within a predictable timeframe (
pongWait
), regardless of message traffic. - Protocol Compliance: Utilizing ping/pong aligns with WebSocket protocol standards and best practices.
^ fixed linter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! Thanks so much for the contribution and fix 🍱😄 Just a couple of questions/suggestions.
Mostly concerned that the ReadMessage
in that go-routine could cause a race-condition when sending a Ping
to the same socket we write the message bytes to.
Otherwise, run make fmt && make lint
on your local to sort out these linting issues 🙂
if err := ws.SetWriteDeadline(time.Now().Add(writeWait)); err != nil { | ||
h.log.Warn("Failed to set write deadline for ping: %v", err) | ||
return | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This setWriteDeadline
method on the connection doesn't ever return an error (despite that confusing function signature)
if err := ws.SetWriteDeadline(time.Now().Add(writeWait)); err != nil { | |
h.log.Warn("Failed to set write deadline for ping: %v", err) | |
return | |
} | |
ws.SetWriteDeadline(time.Now().Add(writeWait)) |
if err := ws.SetWriteDeadline(time.Now().Add(writeWait)); err != nil { | ||
writeErr = err | ||
break | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as below comment:
if err := ws.SetWriteDeadline(time.Now().Add(writeWait)); err != nil { | |
writeErr = err | |
break | |
} | |
ws.SetWriteDeadline(time.Now().Add(writeWait)); |
if err := ws.SetReadDeadline(time.Now().Add(pongWait)); err != nil { | ||
h.log.Warn("Failed to set read deadline: %v", err) | ||
return | ||
} | ||
ws.SetPongHandler(func(string) error { | ||
return ws.SetReadDeadline(time.Now().Add(pongWait)) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to set the read deadline before creating and inside the PongHandler
?
go func() { | ||
defer close(done) | ||
for { | ||
_, _, err := ws.ReadMessage() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not 100% clued up on websockets so excuse the ignorance but will doing a ws.ReadMessage()
clear all bytes in the buffer after reading? Else, does a WriteMessage
fan-out to all consumers on the connection?
I'm concerned about a race condition where the payload we send using the below WriteMessage
ends up being consumed in this goroutine instead of to the client where intending to send this.
const ( | ||
writeWait = 10 * time.Second | ||
pongWait = 60 * time.Second | ||
pingPeriod = (pongWait * 9) / 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all makes sense! A small nit would be we should consider adding these as configuration variables with the advanced()
flag. Think these are sensible defaults in any-case.
At the moment http_server websocket implementation just writes to webscocket, and do not have a way to detect client disconnect proactively. More over, if client disconnects, and it fails write to it, it does not stop the handler, and continues to be inside the endless for loop, untill stream is stopped
for !h.shutSig.IsSoftStopSignalled() {
.It cause issues when you have multiple clients connected/disconnected, because essentially it fanout using round robin to them, but if you have closed "zombie" connections, they will also consume events.
It fixes the issue by using websocket native mechanisms like "ping/pong" methods. Also it adds ephemeral "Read", just to immidiately detect the client shutdown.